Abstract

This paper presents a security analysis of Onion Routing, an application independent
infrastructure for traffic-analysis-resistant and anonymous Internet connections. It also
includes an overview of the current system design, definitions of security goals and new
adversary models.
1  Introduction

This paper presents a security analysis of Onion Routing, an application independent
infrastructure for traffic-analysis-resistant and anonymous Internet connections. It also includes an
overview  of  the  new  system,  definitions  of  security  goals  and  new  adversary  models.  Although 
the  conceptual  development  and  informal  arguments  about  the  security  of  Onion  Routing  have 
been  presented  elsewhere  [9,  15,  16,  10],  we  have  not  previously  attempted  to  analyze  or  quantify 
the  security  provided  against  specific  attacks  in  detail.  That  is  the  primary  contribution  of  this 
paper. 

The  primary  goal  of  Onion  Routing  is  to  provide  strongly  private  communications  in  real 
time  over  a  public  network  at  reasonable  cost  and  efficiency.  Communications  are  intended  to 
be  private  in  the  sense  that  an  eavesdropper  on  the  public  network  cannot  determine  either  the 
contents of messages flowing between Alice and Bob or even whether Alice and Bob are
communicating with each other. A secondary goal is to provide anonymity to the sender and receiver, so
that  Alice  may  receive  messages  but  be  unable  to  identify  the  sender,  even  though  she  may  be 
able  to  reply  to  those  messages. 

An  initial  design  has  been  implemented  and  fielded  to  demonstrate  the  feasibility  of  the 
approach.  This  prototype,  which  uses  computers  operating  at  the  Naval  Research  Laboratory  in 
Washington,  D.C.,  to  simulate  a  network  of  five  Onion  Routing  nodes,  attracted  increasing  use 
over the two years it was available. While it was in operation, users in more than sixty countries and
all seven major US top level domains initiated up to 1.5 million connections per month through
the prototype system; cf. also Figure 1, which shows connections per day averaged over the
preceding 30 days. This demand demonstrates both an interest in the service and the feasibility
of the approach. However, the initial prototype lacked a number of features needed to make the
system robust and scalable, and to resist insider attacks or more extensive eavesdropping. A
design for a second generation system that addresses these issues is complete, and the processes
required to release the source code for public distribution have been initiated. Several companies
have contacted NRL with intent to commercially license Onion Routing.

[Figure 1: 30 Day Rolling Average of Onion Routing Usage: 3/1/98 - 3/1/99]

This  paper  analyzes  the  protection  provided  by  the  second  generation  design.  We  start  by 
describing,  briefly,  the  architecture  and  features  of  the  second  generation  system  relevant  to  our 
analysis.  In  section  3  we  define  security  goals  for  anonymity  and/or  traffic-analysis-resistance. 
In  section  4  we  give  some  assumptions  about  the  configuration  of  our  network.  In  section  5, 
we  set  out  our  adversary  model.  In  section  6,  we  present  a  security  assessment  based  on  the 
definitions  and  assumptions  made  in  earlier  sections.  Finally,  we  compare  Onion  Routing  to 
systems  with  similar  goals,  most  specifically  with  Crowds  [17]. 
2  Onion  Routing  Overview


This section provides a brief overview of Onion Routing for readers not familiar with it.
Conceptual development of Onion Routing as well as a description of the design for the previous
system can be found in [9, 16]. Brief descriptions of different aspects of the current design can
be  found  in  [10,  20].  Readers  familiar  with  Onion  Routing  may  wish  to  skip  to  the  next  section. 

Onion  Routing  builds  anonymous  connections  within  a  network  of  onion  routers,  which  are, 
roughly,  real-time  Chaum  Mixes  [3].  A  Mix  is  a  store-and-forward  device  that  accepts  a  number 
of  fixed-length  messages  from  different  sources,  performs  cryptographic  transformations  on  the 
messages,  and  then  forwards  the  messages  to  the  next  destination  in  an  order  not  predictable 
from the order of inputs. A single Mix makes it difficult to track a particular message by specific
bit-pattern, size, or ordering with respect to other messages. Routing through numerous
Mixes in the network makes determining who is talking to whom even more difficult.
While  Chaum’s  Mixes  could  store  messages  for  an  indefinite  amount  of  time  waiting  to  receive 
an  adequate  number  of  messages  to  mix  together,  a  Core  Onion  Router  (COR)  is  designed  to 
pass  information  in  real  time,  which  limits  mixing  and  potentially  weakens  the  protection.  Large 
volumes  of  traffic  (some  of  it  perhaps  synthetic)  can  improve  the  protection  of  real  time  mixes. 

Onion Routing can be used with applications that are proxy-aware, as well as several
non-proxy-aware applications, without modification to the applications. Supported protocols include
HTTP, FTP, SMTP, rlogin, telnet, NNTP, finger, whois, and raw sockets. Proxies have been
designed but not developed for Socks5, DNS, NFS, IRC, HTTPS, SSH, and Virtual Private
Networks (VPNs).

The  proxy  incorporates  three  logical  layers:  an  optional  application  specific  privacy  filter,  an 
application  specific  translator  that  converts  data  streams  into  an  application  independent  format 
of  fixed  length  cells  accepted  by  the  Onion  Routing  (OR)  network,  and  an  onion  management 
layer  (the  onion  proxy)  that  builds  and  handles  the  anonymous  connections.  The  onion  proxy  is 
the  most  trusted  component  in  the  system,  because  it  knows  the  true  source  and  destination  of 
the  connections  that  it  builds  and  manages.  To  build  onions  and  hence  define  routes  the  onion 
proxy  must  know  the  topology  and  link  state  of  the  network,  the  public  certificates  of  nodes  in 
the  network,  and  the  exit  policies  of  nodes  in  the  network. 

Onion  Routing’s  anonymous  connections  are  protocol  independent  and  exist  in  three  phases: 
connection  setup,  data  movement,  and  connection  termination.  Setup  begins  when  the  initiator 
creates  an  onion,  which  defines  the  path  of  the  connection  through  the  network.  An  onion  is  a 
(recursively)  layered  data  structure  that  specifies  properties  of  the  connection  at  each  point  along 
the  route,  e.g.,  cryptographic  control  information  such  as  the  different  symmetric  cryptographic 
algorithms  and  keys  used  during  the  data  movement  phase.  Each  onion  router  along  the  route 
uses  its  private  key  to  decrypt  the  entire  onion  that  it  receives.  This  operation  exposes  the 
cryptographic  control  information  for  this  onion  router,  the  identity  of  the  next  onion  router  in 
the  path  for  this  connection,  and  the  embedded  onion.  The  onion  router  pads  the  embedded 
onion  to  maintain  a  fixed  size  and  sends  it  onward.  The  final  onion  router  in  the  path  connects 
to  a  responder  proxy,  which  will  forward  data  to  the  remote  application. 
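
The recursive layering described above can be illustrated with a minimal sketch. It models only the structure of an onion, not its cryptography: the names `Layer`, `build_onion`, and `peel` are hypothetical, the check on `router` stands in for decryption with that router's private key, and the padding step that keeps the onion at a fixed size is omitted.

```python
import os
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Layer:
    router: str               # router whose private key would open this layer
    forward_key: bytes        # symmetric key for data moving toward the responder
    backward_key: bytes       # symmetric key for data moving toward the initiator
    next_hop: Optional[str]   # None at the final router in the path
    inner: Optional["Layer"]  # the embedded onion, opaque to this router


def build_onion(route: List[str]) -> Layer:
    """Wrap layers from the last router outward; the first router's layer is outermost."""
    onion: Optional[Layer] = None
    next_hop: Optional[str] = None
    for router in reversed(route):
        # Key sizes are an assumption; the onion may also name the algorithms to use.
        onion = Layer(router, os.urandom(16), os.urandom(16), next_hop, onion)
        next_hop = router
    if onion is None:
        raise ValueError("route must contain at least one router")
    return onion


def peel(router: str, onion: Layer) -> Tuple[bytes, bytes, Optional[str], Optional[Layer]]:
    """Return what one router learns: its keys, the next hop, and the embedded onion."""
    if onion.router != router:
        # Stand-in for a failed private-key decryption.
        raise PermissionError("this layer is sealed for a different router")
    return onion.forward_key, onion.backward_key, onion.next_hop, onion.inner
```

In this sketch each router sees only its own control information and the identity of its successor; the rest of the route stays inside the embedded onion.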

After  the  connection  is  established,  data  can  be  sent  in  both  directions.  The  initiator’s 
onion  proxy  receives  data  from  an  application,  breaks  it  into  fixed  size  cells  (128  bytes  long, 
at present), and encrypts each cell multiple times, once for each onion router the connection
traverses, using the algorithms and keys that were specified in the onion. As a cell of data
moves  through  the  anonymous  connection,  each  onion  router  removes  one  layer  of  encryption, 
so  the  data  emerges  as  plaintext  from  the  final  onion  router  in  the  path.  The  responder  proxy 
regroups  the  plaintext  cells  into  the  data  stream  originally  submitted  by  the  application  and 
forwards  it  to  the  destination.  For  data  moving  backward,  from  the  recipient  to  the  initiator, 
this  process  occurs  in  the  reverse  order,  with  the  responder  proxy  breaking  the  traffic  into  cells, 
and  successive  onion  routers  encrypting  it  using  (potentially)  different  algorithms  and  keys  than 
the  forward  path.  In  this  case  the  initiator’s  onion  proxy  decrypts  the  data  multiple  times, 
regroups  the  plaintext  cells,  and  forwards  them  to  the  application. 
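
The forward data path can be sketched as follows. This is an illustration under stated assumptions, not the system's cipher suite: the keystream is a toy construction (SHA-256 in counter mode), the zero-padding of the last cell is an assumption about the cell format, and the function names are hypothetical. Because the toy layers are XOR masks they commute, so the ordering shown is structural only.

```python
import hashlib
from typing import List

CELL_SIZE = 128  # bytes, the cell size stated in the text


def keystream(key: bytes, cell_index: int, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); illustrative only, not a vetted cipher."""
    out = b""
    block = 0
    while len(out) < length:
        counter = cell_index.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hashlib.sha256(key + counter).digest()
        block += 1
    return out[:length]


def xor_layer(key: bytes, cell_index: int, cell: bytes) -> bytes:
    """Add or remove one encryption layer (XOR is its own inverse)."""
    ks = keystream(key, cell_index, len(cell))
    return bytes(a ^ b for a, b in zip(cell, ks))


def to_cells(data: bytes) -> List[bytes]:
    """Split a data stream into fixed-size cells, zero-padding the last one."""
    cells = [data[i:i + CELL_SIZE] for i in range(0, len(data), CELL_SIZE)]
    return [c.ljust(CELL_SIZE, b"\0") for c in cells]


def proxy_encrypt(cell: bytes, cell_index: int, route_keys: List[bytes]) -> bytes:
    """Initiator's proxy applies one layer per router, last router's layer first."""
    for key in reversed(route_keys):
        cell = xor_layer(key, cell_index, cell)
    return cell
```

Each router on the path would then call `xor_layer` once with its own key, so that the cell leaves the final router as plaintext while keeping its 128-byte size at every hop.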

Normally,  either  the  application  that  initiates  a  connection  or  the  destination  server  will 
terminate  it.  Since  onion  routers  may  fail,  however,  any  onion  router  involved  in  a  connection 
can  cause  that  connection  to  be  terminated.  To  an  application  (either  at  the  initiating  site  or  at 
the  destination),  such  a  failure  looks  the  same  as  if  the  remote  site  had  simply  closed  its  TCP 
connection. 

Longstanding  TCP  connections  (called  ‘links’  or  ‘thick  pipes’)  between  CORs  define  the 
topology  of  an  OR  network.  Links  are  negotiated  pairwise  by  CORs  in  the  course  of  becoming 
neighbors.  All  traffic  passing  over  a  link  is  encrypted  using  stream  ciphers  negotiated  by  the 
pair  of  onion  routers  on  that  link.  This  cipher  is  added  on  top  of  the  onion  layers  by  the  COR 
sending  a  cell  across  a  link  and  stripped  off  again  by  the  receiving  COR.  Since  TCP  guarantees 
sequential  delivery,  synchronization  of  the  stream  ciphers  is  not  an  issue.  To  support  a  new 
anonymous  connection,  an  onion  proxy  creates  a  random  route  within  the  current  OR  network 
topology.  The  (fixed)  size  of  an  onion  would  limit  a  route  to  a  maximum  of  11  nodes  in  the 
current  implementation.  Because  connections  can  be  tunneled,  however,  arbitrarily  long  routes 
are  possible,  even  though  they  will  become  impractical  at  some  point  because  of  the  resulting 
network  latencies. 

An  eavesdropper  or  a  compromised  onion  router  might  try  to  trace  packets  based  on  their 
content  or  on  the  timing  of  their  arrival  and  departure  at  a  node.  All  data  (onions,  content, 
and  network  control)  is  sent  through  the  Onion  Routing  network  in  uniform-sized  cells  (128 
bytes).  Because  it  is  encrypted  (or  decrypted)  as  it  traverses  each  node,  a  cell  changes  its 
appearance  (but  not  its  size)  completely  from  input  to  output.  This  prevents  an  eavesdropper 
or  a  compromised  onion  router  from  following  a  packet  based  on  its  bit  pattern  as  it  moves 
across  the  Onion  Routing  network.  In  addition,  all  cells  arriving  at  an  onion  router  within  a 
fixed  time  interval  are  collected  and  reordered  randomly  (i.e.,  “mixed”)  before  they  are  sent  to 
their  next  destinations,  in  order  to  prevent  an  eavesdropper  from  relating  an  outbound  packet 
from  a  router  with  an  earlier  inbound  one  based  on  timing  or  sequence  of  arrival. 
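
A minimal sketch of this interval-based mixing, assuming arrivals are given as (timestamp, next hop, cell) triples. The function name `mix` and the representation of arrivals are hypothetical; a real router would also re-encrypt each cell before forwarding.

```python
import random
from typing import Dict, List, Tuple

Arrival = Tuple[float, str, bytes]  # (arrival time in seconds, next hop, cell)


def mix(arrivals: List[Arrival], interval: float,
        rng: random.Random) -> List[Tuple[str, bytes]]:
    """Group cells by arrival interval, then emit each group in random order."""
    batches: Dict[int, List[Tuple[str, bytes]]] = {}
    for t, next_hop, cell in arrivals:
        batches.setdefault(int(t // interval), []).append((next_hop, cell))
    out: List[Tuple[str, bytes]] = []
    for slot in sorted(batches):
        batch = batches[slot]
        rng.shuffle(batch)  # output order within a slot is independent of arrival order
        out.extend(batch)
    return out
```

Cells from different intervals are never interleaved here, so the anonymity a batch provides depends on how many cells arrive within one interval.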

If  traffic  levels  are  low  and  requirements  for  real-time  transmission  are  high,  waiting  for 
enough traffic to arrive so that mixing provides good hiding might cause unacceptable transmission
delays. In this case, padding (synthetic traffic) can be added to the thick pipes. Conversely,
an  attacker  might  try  to  use  a  pulse  of  traffic  to  track  cells  flowing  through  the  system.  This 
attack  can  be  made  more  difficult  by  imposing  limits  on  the  traffic  flow  over  particular  links, 
though  this  strategy  can  increase  latency. 

If  a  link  between  two  CORs  goes  down  or  comes  up,  that  information  is  propagated  among 
the  active  CORs  and  proxies  (again  using  the  fixed  cell  size,  and  with  the  same  protections  as 
other OR traffic). This information permits proxies to build onions with feasible routes. Since
routes are permitted to have loops of length greater than one hop, the number of active nodes
does  not  limit  the  route  length,  as  long  as  at  least  two  nodes  are  active. 

An  onion  router  cannot  tell  the  ultimate  destination  of  traffic  it  forwards  to  another  onion 
router.  The  Responder  Proxy  running  on  the  last  onion  router  in  a  path,  however,  can  determine 
where  traffic  leaving  the  OR  network  is  bound.  Some  operators  of  onion  routers  may  wish  to 
restrict  the  set  of  destinations  (non  onion-router  destinations)  to  which  their  machines  will 
forward  traffic.  For  example,  a  commercial  onion  router  might  decide  that  it  would  forward 
traffic  only  to  .com  sites,  or  a  government  onion  router  might  decide  only  to  permit  outgoing 
traffic  destined  for  .gov  sites.  We  call  this  an  “exit  policy,”  and  have  implemented  software 
so  that  sites  can  define  and  enforce  such  policies.  The  onion  proxy  creating  a  path  through 
the  OR  network  needs  information  about  exit  policies,  so  that  it  doesn’t  create  an  infeasible 
route,  and  the  second  generation  system  provides  this  information.  The  use  of  this  mechanism 
could  of  course  warp  the  traffic  flows  through  the  network  and  might  therefore  permit  some 
inferences  about  traffic  flow.  To  counteract  the  ability  of  compromised  CORs  to  lie  about 
network topology, public keys, or exit policies, an external audit and verification system for
this  information  has  been  built  into  every  component.  Without  both  mechanisms,  however, 
we  believe  far  fewer  institutions  would  be  willing  to  operate  ORs,  and  the  decreased  level  of 
participation  could  also  reduce  the  effectiveness  of  our  scheme. 
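
As an illustration of how an onion proxy might use exit policy information to avoid infeasible routes, the sketch below models a policy as a set of permitted domain suffixes, following the .com and .gov examples in the text. The representation (a tuple of suffixes, with `None` meaning unrestricted) and the function names are assumptions; the actual policy format is not described here.

```python
from typing import Dict, List, Optional, Tuple

ExitPolicy = Optional[Tuple[str, ...]]  # None means an unrestricted exit policy


def permits(policy: ExitPolicy, host: str) -> bool:
    """True if a router with this policy will forward traffic to `host`."""
    return policy is None or any(host.endswith(suffix) for suffix in policy)


def feasible_exits(policies: Dict[str, ExitPolicy], host: str) -> List[str]:
    """Routers that could serve as the final node of a route to `host`."""
    return sorted(r for r, policy in policies.items() if permits(policy, host))
```

For example, with one router restricted to `.com`, one restricted to `.gov`, and one unrestricted, only the first and third are candidates for the final hop of a connection to a `.com` site.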

The  initial  OR  prototype,  since  it  was  not  intended  for  wide  deployment,  took  a  number  of 
short  cuts.  It  enforced  a  fixed  length  (five  hops)  for  all  routes.  It  did  not  provide  a  method  for 
maintaining  topology  information  or  communicating  topology  information  among  nodes.  It  did 
not  provide  padding  or  bandwidth  limiting  facilities.  All  of  these  mechanisms  are  included  in 
the  second  generation  system.  To  ease  its  widespread  distribution,  the  second  generation  system 
does  not  include  actual  cryptographic  software.  Cryptographic  functions  are  invoked  via  calls  to 
Crypto  APIs,  and  the  operator  must  provide  cryptographic  libraries  to  implement  those  APIs. 

3  Security  Goals 

Protection  of  communications  against  traffic  analysis  does  not  require  support  for  anonymous 
communication. By encrypting data sent over a traffic-analysis-resistant connection, for example,
endpoints may identify themselves to one another without revealing the existence of their
communication to the rest of the network. However, traffic analysis is a potent tool for revealing
parties in conversation, thereby compromising a communication that was intended to be
anonymous.  Thus,  we  consider  goals  for  anonymous,  as  well  as  private,  communication.  In  fact, 
the  goals  for  these  two  cases  differ  very  little;  the  distinction  comes  in  the  specification  of  the 
adversary. 

There  are  various  basic  properties  relating  initiators,  responders,  and  connections  that  we 
wish to protect. Pfitzmann and Waidner [22] have described sender and receiver anonymity
as  respectively  hiding  the  identity  of  the  sender  or  receiver  of  a  particular  message  from  an 
attacker,  and  unlinkability  as  a  somewhat  weaker  property,  preventing  an  attacker  from  linking 
the  physical  message  sent  by  the  sender  with  the  physical  message  received  by  the  recipient.  In 
a  similar  vein,  we  define: 

Sender  activity:  the  mere  fact  that  a  sender  is  sending  something. 
Receiver activity: the mere fact that a receiver is receiving something (see footnote 1).

Sender  content:  that  the  sender  sent  a  particular  content. 

Receiver  content:  that  the  receiver  received  a  particular  content. 

These  are  the  basic  protections  with  which  we  will  be  concerned.  We  will  also  be  concerned 
with  more  abstract  anonymity  protection.  For  example,  it  may  be  far  more  revealing  if  these  are 
compromised  in  combination.  As  one  example,  consider  50  people  sending  the  message  “I  love 
you”  to  50  other  people,  one  each.  We  thus  have  sender  and  receiver  activity  as  well  as  sender 
and  receiver  content  revealed  for  all  of  these  messages.  However,  without  source  and  destination 
for  each  of  these,  we  don’t  know  who  loves  whom. 

One  of  the  combined  properties  that  concerns  us  is: 

Source-destination linking: that a particular source is sending to a particular destination (see footnote 2).

This  may  or  may  not  involve  a  particular  message  or  transmission.  Building  on  our  previous 
example,  suppose  50  people  send  50  messages  each  to  50  other  people  (2500  messages  total). 
Then,  for  any  sender  and  receiver,  we  can  say  with  certainty  that  they  were  linked  on  exactly 
one  message;  although  we  may  not  be  able  to  say  which  one.  For  purposes  of  this  paper  we 
will be concerned with connections, specifically, the anonymity properties of the initiator and
responder  for  a  given  connection. 


4  Network  Model 

For  purposes  of  this  analysis,  an  Onion  Routing  network  consists  of  onion  proxies  (or  simply 
proxies),  Core  Onion  Routers  (CORs),  links,  over  which  CORs  pass  fixed  length  cells,  and 
responder  proxies,  which  reconstruct  cells  into  the  application  layer  data  stream. 

An  attempt  to  analyze  the  traffic  on  a  real  onion  routing  network  might  try  to  take  advantage 
of  topological  features,  exit  policies,  outside  information  about  communicants,  and  other  details 
that we cannot hope to incorporate in a mathematical assessment of onion routing networks
generally.  We  make  a  number  of  general  and  specific  assumptions  to  permit  us  to  proceed  with 
the  analysis.  We  also  comment  on  the  validity  of  these  assumptions  below. 

Assumption  1.  The  network  of  onion  routers  is  a  clique  (fully  connected  graph). 

Since links are simply TCP/IP connections traversing the Internet, a COR can maintain
many such connections with relatively little overhead, and the second generation
implementation allows a COR to have on the order of fifty thick pipe links to other CORs.
Beyond that size, one is likely to find regions of highly connected nodes with multiple
bridges between them. Assumption 1 thus seems reasonable for OR networks of up to 50
CORs.

Footnote 1: It may be useful to distinguish principals that actually receive messages from those that are the target
(intended receiver) of a message. For example, if a message is public-key encrypted for a principal and broadcast
to this principal and 99 others, then barring transmission problems, all 100 received the message. However, only
one was the intended destination of the message. In this paper, it is with the intended receiver that we are
concerned.

Footnote 2: In [17], ‘unlinkability’ is limited to the case where a sender and receiver are both explicitly known to be active
and targeted by an adversary; nonetheless they cannot be shown to be communicating with each other. Onion
Routing does provide such unlinkability in some configurations, and depending on the adversary, but this is not
a general goal for all connections.

Assumption  2.  Links  are  all  padded  or  bandwidth-limited  to  a  constant  rate. 

This  simplification  allows  us  to  ignore  passive  eavesdroppers,  since  all  an  eavesdropper 
will  see  on  any  link  is  a  constant  flow  of  fixed  length,  encrypted  cells.  In  fact,  we  expect 
that padding and limiting will be used to smooth rapid (and therefore potentially trackable)
changes in link traffic rather than to maintain absolutely fixed traffic flows. Even if
fluctuations  could  be  observed,  no  principal  remote  from  a  link  can  identify  his  own  traffic 
as  it  passes  across  that  link,  since  each  link  is  covered  by  the  stream  cipher  under  a  key 
that  the  remote  principal  does  not  possess. 

Assumption  3.  The  exit  policy  of  any  node  is  unrestricted. 

As  noted  in  section  2,  we  expect  that  many  CORs  will  conform  to  this  assumption,  but 
some  may  not.  Restrictive  exit  policies,  to  the  extent  that  they  vary  among  CORs,  could 
affect  the  validity  of  Assumption  4,  since  the  exit  policy  will  limit  the  choice  of  the  final 
node  in  a  path.  However,  since  in  our  adversary  model  the  last  COR  may  always  be 
compromised, it makes no difference to the security of a connection given our other assumptions.
Also note that this assumption is independent of whether or not the connection
of  some  destinations  to  the  final  COR  is  hidden,  e.g.,  by  a  firewall. 

Assumption  4.  For  each  route  through  the  OR  network  each  hop  is  chosen  at  random. 

This  assumption  depends  primarily  on  the  route  selection  algorithm  implemented  by  the 
onion  proxy,  and  secondarily  on  conflicts  between  exit  policies  and  connection  requests. 
In  practice,  we  expect  this  assumption  to  be  quite  good. 

Assumption 5. The number of nodes in a route, n, is chosen from $2 \le n < \infty$ based on repeated
flips of a weighted coin.


Note  that  the  expected  route  length  is  completely  determined  by  the  weighting  of  the 
coin.  The  length  is  extended  by  one  for  each  flip  until  the  coin-flip  comes  up  on  the 
terminate-route  side — typically  the  more  lightly  weighted  side.  Thus,  for  example,  if  the 
coin  is  weighted  so  that  the  probability  of  extending  the  route  is  .8,  then  the  expected 
route  length  is  5.  Choosing  route  length  by  this  means,  as  opposed  to  choosing  randomly 
within  some  range  was  largely  motivated  by  the  Crowds  design  and  the  security  analysis 
in  [17],  of  which  we  will  say  more  below. 
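
The coin-flip procedure can be checked with a short sketch. It assumes the route starts at length `start` and grows by one for every "extend" outcome before the first "terminate" outcome, which gives an expected length of start + p/(1 - p) for extension probability p. With `start=1` this reproduces the figure of 5 quoted above for p = 0.8; how the lower bound on n is enforced is not specified in this section, so `start` is left as a parameter. The function names are hypothetical.

```python
import random


def sample_route_length(p_extend: float, rng: random.Random, start: int = 1) -> int:
    """Flip a weighted coin until it lands on the terminate-route side."""
    n = start
    while rng.random() < p_extend:
        n += 1  # each "extend" outcome adds one hop
    return n


def expected_route_length(p_extend: float, start: int = 1) -> float:
    """Closed form: start plus the mean number of extensions, p / (1 - p)."""
    return start + p_extend / (1.0 - p_extend)
```

Because the distribution is geometric, long routes are always possible but exponentially unlikely, and an observer on the route cannot infer its position from a fixed route length.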

Many  configurations  of  Onion  Routing  components  are  possible,  all  yielding  different  kinds 
and  degrees  of  assurance.  [20]  We  will  limit  our  analysis  to  the  two  configurations  that  we  expect 
to  be  both  the  most  common  and  the  most  widely  used. 
In the remote-COR configuration, the onion proxy is the only OR system component
that  runs  on  a  machine  trusted  by  the  user.  The  first  COR  (and  any  infunnels)  are  running  on 
a  remote  untrusted  machine. 

In  the  local-COR  configuration,  all  components  up  to  the  first  COR  are  running  on  locally 
trusted  machines.  This  corresponds  to  a  situation  where  a  COR  is  running  on  an  enclave  firewall, 
and  onions  might  be  built  at  individual  workstations  or  at  the  firewall  depending  on  efficiencies, 
enclave  policy,  etc.  It  also  corresponds  to  a  situation  where  an  individual  with  good  connections 
to  the  Internet  is  running  his  own  onion  router  to  reduce  the  amount  of  information  available 
to  untrusted  components.  The  important  aspect  of  this  connection  is  that  the  system  from  end 
application  to  the  first  COR  is  essentially  a  black  box.  Perhaps  contrary  to  initial  appearance, 
a  PC  using  dial-up  connection  to  a  (trustworthy)  ISP  might  naturally  be  considered  to  be  in 
this  configuration.  This  view  is  appropriate  with  respect  to  any  attacker  residing  entirely  on 
the  Internet  because  it  is  excluded  from  the  telephone  dial-up  connections  running  between  the 
customer  and  the  ISP,  and  the  ISP  is  assumed  to  be  trusted. 

We  must  also  make  assumptions  about  the  entrance  policy  of  sites.  Since  entrance  policy  is 
controlled  by  the  proxy,  it  is  natural  to  assume  that  anyone  may  connect  using  any  protocol  in 
the  remote-COR  configuration.  In  practice,  CORs  might  only  accept  connections  from  specific 
principals  (subscribers?);  although  the  COR  will  be  unable  to  determine,  hence  control,  the 
application  being  run.  In  the  local-COR  configuration,  the  entrance  policy  is  effectively  to 
exclude  all  connections  from  outside  the  black  box.  (However,  it  still  will  forward  connections 
from  any  other  COR,  and  it  is  assumed  to  have  an  open  exit  policy.)  These  assumptions  are 
then: 

Assumption  6.  Every  COR  is  connected  to  the  OR  network  and  the  outside  via  either  the 
remote-COR  configuration  or  the  local-COR  configuration,  but  not  both. 

Assumption  7.  The  entrance  policy  for  entering  the  OR  network  via  the  remote-COR  configu¬ 
ration  is  unrestricted. 

Assumption  8.  The  entrance  policy  for  entering  the  OR  network  via  the  local-COR  configura¬ 
tion  is  to  exclude  all  but  internal  connections. 

Notice  that  these  policy  assumptions  also  determine  the  initiator  being  protected  by  use  of 
the  OR  network.  For  the  remote-COR  configuration,  it  is  the  end  application  (via  its  proxy) 
that  is  being  protected.  For  the  local-COR  configuration,  it  is  the  local  COR  that  is  effectively 
the  initiator  being  protected.  This  conforms  well  with  the  possibility  that  a  corporate  or  other 
enclave  may  wish  to  protect  not  just  the  activity  of  individual  users  of  the  network  but  that  of 
the  enclave  as  a  whole.  Likewise,  an  individual  who  is  running  his  own  COR  would  clearly  want 
to  protect  connections  emanating  from  that  COR  since  he  is  the  only  possible  initiator  of  those 
connections. 


5  Adversary  Model 

One  of  the  main  challenges  in  designing  anonymous  communications  protocols  is  defining  the 
capabilities of the adversary. Given the tools at our disposal today, the adversary model
essentially determines which salient characteristics the system should deploy in order to defeat
her.

The  basic  adversaries  we  consider  are: 

Observer:  can  observe  a  connection  (e.g.,  a  sniffer  on  an  Internet  router),  but  cannot  initiate 
connections. 

Disrupter:  can  delay  (indefinitely)  or  corrupt  traffic  on  a  link. 

Hostile  user:  can  initiate  (destroy)  connections  with  specific  routes  as  well  as  varying  the 
traffic  on  the  connections  it  creates. 

Compromised  COR:  can  arbitrarily  manipulate  the  connections  under  its  control,  as  well  as 
creating  new  connections  (that  pass  through  itself). 

All feasible adversaries can be composed out of these basic adversaries. This includes
combinations such as one or more compromised CORs cooperating with disrupters of links to which
those CORs are not adjacent, or such as combinations of hostile outsiders and observers.
However, we are able to restrict our analysis of adversaries to just one class, the compromised COR.
We  now  justify  this  claim. 

Especially  in  light  of  our  assumption  that  the  network  forms  a  clique,  a  hostile  outsider  can 
perform  a  subset  of  the  actions  that  a  compromised  COR  can  do.  Also,  while  a  compromised 
COR  cannot  disrupt  or  observe  a  link  unless  it  is  adjacent  to  it,  any  adversary  that  replaces 
some  or  all  observers  and/or  disrupters  with  a  compromised  COR  adjacent  to  the  relevant 
link  is  more  powerful  than  the  adversary  it  replaces.  And,  in  the  presence  of  adequate  link 
padding  or  bandwidth  limiting  even  collaborating  observers  can  gain  no  useful  information  about 
connections  within  the  network.  They  may  be  able  to  gain  information  by  observing  connections 
to  the  network  (in  the  remote-COR  configuration),  but  again  this  is  less  than  what  the  COR 
to  which  such  connection  is  made  can  learn.  Thus,  by  considering  adversaries  consisting  of 
collections  of  compromised  CORs  we  cover  the  worst  case  of  all  combinations  of  basic  adversaries. 
Our  analysis  focuses  on  this  most  capable  adversary,  one  or  more  compromised  CORs. 

The  possible  distributions  of  adversaries  are 

•  single  adversary 

•  multiple  adversary:  A  fixed,  randomly  distributed  subset  of  CORs  is  compromised. 

•  roving  adversary:  A  fixed-bound  size  subset  of  CORs  is  compromised  at  any  one  time. 
At  specific  intervals,  other  CORs  can  become  compromised  or  uncompromised. 

•  global  adversary:  All  CORs  are  compromised. 

Onion Routing provides no protection against a global adversary. If all the CORs are
compromised, they can know exactly who is talking to whom. The content of what was sent will be
revealed as it emerges from the OR network, unless it has been end-to-end encrypted outside
the OR network. Even a firewall-to-firewall connection is exposed if, as assumed above, our goal
is to hide which local-COR is talking to which local-COR.
6  Security  Assessment


As  discussed  above,  there  are  several  possible  adversary  models.  Having  specifically  ruled  out 
the  case  of  a  global  adversary,  we  now  focus  on  the  roving  adversary  model.  (The  remaining 
models  are  subsumed  by  it.)  We  begin  the  security  assessment  by  defining  some  variables  and 
features  of  the  environment. 

Recall  that  routes  are  of  indeterminate  length  and  that  each  route  is  a  random  walk  from 
the  route  origin  through  the  network. 

We assume a closed system composed of a multitude of users and a set S of CORs. Let r be
the total number of active CORs in the system, and, as mentioned in Section 4, let n be the
(variable) length of a specific route $\mathcal{R} = \{R_1, \ldots, R_n\}$, where each $R_j$ is a COR
in the route $\mathcal{R}$. Routes are selected randomly (each route is a random walk from the route
origin through the network) and hops within a route are selected independently (except that cycles
of length one are forbidden).
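The route model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CORs are represented by the integers 0 to r-1, every hop is drawn uniformly, and the function name `sample_route` is hypothetical rather than part of any Onion Routing implementation.

```python
import random


def sample_route(r, n, rng=None):
    """Return a route of n COR indices drawn from range(r).

    Each hop is chosen uniformly and independently, except that a hop may
    not repeat the COR immediately before it (no cycle of length one).
    """
    if n < 1 or r < 1 or (n > 1 and r < 2):
        raise ValueError("need n >= 1, r >= 1, and r >= 2 when n > 1")
    rng = rng or random.Random()
    route = [rng.randrange(r)]
    while len(route) < n:
        hop = rng.randrange(r)
        if hop != route[-1]:  # reject self-loops, allow longer cycles
            route.append(hop)
    return route
```

Note that a COR may still appear more than once on a route, as long as it does not immediately follow itself.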

Our  roving  adversary  is  characterized  by  c,  the  maximum  number  of  CORs  the  adversary  is 
able  to  corrupt  within  a  fixed  time  interval  (a  round).  At  the  end  of  each  round,  the  adversary 
can  choose  to  remain  in  place  or  shift  some  of  its  power  to  corrupt  other  CORs.  In  the  latter 
case,  previously-corrupted  CORs  are  assumed  to  be  instantly  “healed”,  i.e.,  they  resume  normal, 
secure operation. $C_i$ represents the set of CORs controlled by the adversary at round $i$ ($C_i \subset S$).
We  note  that  this  roving  adversary  model  closely  follows  that  found  in  the  literature  on  proactive 
cryptography, e.g., [2, 13]. This is a standard model based on the view that system
locations can be compromised periodically, but periodic security checks will detect compromises.
The resulting responses, as well as periodic system updates and the like, will return compromised
components to normal.

At  present,  most  connections  through  an  Onion  Routing  network  are  likely  to  be  for  Web 
browsing or email. Given the short duration of typical Web and email connections, a static
attack is all that can realistically be mounted; by the time a roving attacker has moved, the
typical  connection  will  have  closed,  leaving  no  trace  amongst  honest  CORs.  (This  is  in  contrast 
to  Crowds,  cf.  below.)  Roving  attacks  are  more  likely  to  be  effective  against  longer  telnet  or  ftp 
connections. 

We first analyze connections initiated from the remote-COR configuration and then connections
initiated from the local-COR configuration. Within each case we consider short-lived and
long-lived connections.

6.1  Assessment  for  Remote-COR  Configuration 

Given a route $\mathcal{R}$ as above, suppose that some but not all of the CORs in the route are
compromised. There are three significant cases:

1. $R_1 \in C_i$: The first node is compromised. In this case sender activity is established by
the adversary. Sender content is not lost since senders always pre-encrypt traffic. The
probability of this event, assuming a random route and random node compromises, is $P_1 = c/r$.

2. $R_n \in C_i$: The last node in the route is compromised. In this case receiver content as well
as receiver activity are compromised. Sender content, sender activity, and source-destination
linking remain protected. The probability of this event is $P_2 = c/r$.

3. $R_1 \in C_i$ and $R_n \in C_i$: Both the first and last node in the path are compromised, so
sender/receiver activity and receiver content are compromised. Moreover, the COR end-points are
now able to correlate cell totals and compromise source-destination linking. Consequently,
sender content can be simply inferred. The probability of this event is $P_3 = c^2/r^2$ (unless
$n = 2$, in which case $P_3 = c(c-1)/r^2$ since self-looping is not allowed).


                             $R_1 \in C_i$   $R_n \in C_i$   $R_1$ and $R_n \in C_i$
sender activity              Yes             No              Yes
receiver activity            No              Yes             Yes
sender content               No              No              Yes (inferred)
receiver content             No              Yes             Yes
source-destination linking   No              No              Yes
probability                  $c/r$           $c/r$           $c^2/r^2$

Table  1:  Properties  of  Attack  Scenarios. 


The adversary’s goal must be to compromise the endpoints, since she gains little by controlling
all intermediate CORs ($R_i$ for $1 < i < n$) of a given route $\mathcal{R} = \{R_1, \ldots, R_n\}$.
In the case of short-lived connections, the roving adversary has, in effect, only one round in
which to compromise the connection, and succeeds in compromising all properties of a connection
with probability $c^2/r^2$ (or $c(c-1)/r^2$ when $n = 2$).
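As a worked illustration of the three probabilities, the following Python sketch evaluates them exactly for given c, r and n. The function name `endpoint_probabilities` and the sample values (r = 50 CORs, c = 5 compromised) are assumptions chosen for the example, not figures from a deployed network; the n = 2 branch simply uses the expression stated in the text.

```python
from fractions import Fraction


def endpoint_probabilities(c, r, n):
    """Return (P1, P2, P3) for c compromised CORs out of r and route length n."""
    p1 = Fraction(c, r)  # first COR compromised
    p2 = Fraction(c, r)  # last COR compromised
    if n == 2:
        # value stated in the text for n = 2 (self-looping is not allowed)
        p3 = Fraction(c * (c - 1), r * r)
    else:
        p3 = Fraction(c * c, r * r)  # both endpoints compromised
    return p1, p2, p3


if __name__ == "__main__":
    print(endpoint_probabilities(5, 50, 5))  # illustrative values only
```

With these sample values, each single endpoint is compromised with probability 1/10 and both endpoints with probability 1/100.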

We  now  consider  the  roving  adversary  against  a  long-lived  connection. 

At route setup time, the probability that at least one COR on the route of length $n$ is in $C_1$
is given by:

$$1 - P(\mathcal{R} \cap C_1 = \emptyset) = 1 - \frac{(r-c)^n}{r^n}$$
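To give a feel for the size of this quantity, here is a small Python sketch. It assumes, as in the route model above, that each of the n hops independently avoids the compromised set with probability (r - c)/r; the function name `prob_route_touched` and the sample values are hypothetical.

```python
from fractions import Fraction


def prob_route_touched(c, r, n):
    """Probability that at least one of n hops lies in the compromised set.

    Assumes each hop independently misses the c compromised CORs (out of r)
    with probability (r - c) / r.
    """
    return 1 - Fraction(r - c, r) ** n


if __name__ == "__main__":
    # illustrative values: 5 of 50 CORs compromised, route of length 5
    print(float(prob_route_touched(5, 50, 5)))
```

For these sample values the result is 40951/100000, so even a 10% adversary sees some part of roughly four routes in ten at setup time.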

We now make the (perhaps optimistic) assumption that, if none of the CORs on the route
are  compromised  at  route  setup  time,  then  the  adversary  will  not  attempt  the  attack.  In  any 
case,  for  such  an  adversary  the  expected  number  of  rounds  must  be  at  least  one  more  than  for 
an  adversary  that  starts  with  at  least  one  compromised  COR  on  the  route.  Given  at  least  one 
compromised  COR  on  the  route,  how  many  rounds  does  it  take  for  the  adversary  to  achieve 
source-destination  linking? 

In  general,  the  attack  starts  when  one  of  the  subverted  CORs  receives  a  route  setup  request. 
She  then  proceeds  to  attempt  the  discovery  of  the  route’s  true  endpoints. 

In  either  case,  at  round  1,  the  adversary  can  establish: 

$R_s$ where $s = \min(j \in [1..n] \text{ and } R_j \in \mathcal{R} \cap C_1)$


as  well  as: 


$R_e$ where $e = \max(j \in [1..n] \text{ and } R_j \in \mathcal{R} \cap C_1)$

While the actual indices $e$ and $s$ are not known to the adversary, she can identify $R_s$ and
$R_e$ by timing the propagation of route setup. Moreover, the adversary can trivially test if
$R_e = R_n$ or $R_s = R_1$.

The  adversary’s  subsequent  optimal  strategy  is  illustrated  in  Figure  2.  As  shown  in  the 
pseudocode,  the  game  played  by  the  adversary  is  two-pronged: 

1.  In  each  round  she  moves  one  hop  closer  towards  route  endpoints  (moving  to  the  preceding 
hop  of  Rs  and  next  hop  of  Re.) 

2. She also randomly picks a set of (at least $c - 2$) routers to subvert from among the
uncorrupted set (which is constantly updated). When one of the end-points is reached,
the adversary can concentrate on the other, thus having $c - 1$ routers to corrupt (at random)
at each round.

In the worst case, it takes a ($c \geq 2$)-adversary $\max(s, n - e)$ rounds to reach both endpoints.
More generally, the greatest value of $\max(s, n - e)$ is $n$. (Also, a single roving adversary always
takes exactly $n$ rounds to compromise source and destination.)

An interesting open issue is the expected number of rounds a ($c \geq 2$)-adversary needs in
order to reach the route endpoints provided that she starts out with at least one compromised
COR on the route. In lieu of an analytic solution, it might be interesting to simulate the game
for sample configurations of honest and adversary nodes and for various possible sets of
connections. We leave this for future work.
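One way such an experiment might be set up is sketched below in Python. It is a rough Monte Carlo model under several simplifying assumptions that go beyond the text: CORs are integers, the adversary remembers the route positions she has already observed even after those CORs are healed, a randomly subverted COR that happens to lie on the route reveals its position immediately, and round 1 is the round of route setup. All function names are hypothetical, and the numbers it produces are not results of this paper.

```python
import random


def sample_route(r, n, rng):
    """Random walk over r CORs: n hops, no hop repeats its predecessor."""
    route = [rng.randrange(r)]
    while len(route) < n:
        hop = rng.randrange(r)
        if hop != route[-1]:
            route.append(hop)
    return route


def rounds_to_link(r, n, c, rng):
    """Play one game; return the round in which both endpoints are held.

    Returns None when no COR on the route is compromised in round 1
    (the adversary is assumed not to attack in that case).
    """
    if c < 2 or c > r or n < 1 or (n > 1 and r < 2):
        raise ValueError("need 2 <= c <= r, n >= 1, and r >= 2 when n > 1")
    route = sample_route(r, n, rng)
    compromised = set(rng.sample(range(r), c))
    healthy = set(range(r)) - compromised  # CORs never yet subverted
    hits = [j for j in range(n) if route[j] in compromised]
    if not hits:
        return None
    s, e = min(hits), max(hits)  # 0-based positions of R_s and R_e
    rounds = 1
    while s > 0 or e < n - 1:
        rounds += 1
        targets = set()
        if s > 0:
            targets.add(route[s - 1])  # step one hop towards R_1
        if e < n - 1:
            targets.add(route[e + 1])  # step one hop towards R_n
        pool = sorted(healthy - targets)
        extra = min(c - len(targets), len(pool))
        targets.update(rng.sample(pool, extra))  # remaining budget: random picks
        healthy -= targets
        hits = [j for j in range(n) if route[j] in targets]
        s, e = min(s, min(hits)), max(e, max(hits))
    return rounds


def estimate_expected_rounds(r, n, c, trials=10000, seed=0):
    """Average rounds over trials in which the adversary starts on the route."""
    rng = random.Random(seed)
    results = [rounds_to_link(r, n, c, rng) for _ in range(trials)]
    attacked = [x for x in results if x is not None]
    return sum(attacked) / len(attacked) if attacked else None
```

Under these assumptions every game finishes within n rounds, in line with the worst-case bound above; calling `estimate_expected_rounds(r, n, c)` for a range of parameters gives an empirical estimate of the expectation.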


6.2  Assessment  For  Local-COR  Configuration 

The  local-COR  configuration  is  distinguished  from  the  remote-COR  configuration  by  the  fact 
that the first COR in the connection is assumed to be immune from compromise. In the remote-COR
configuration, compromising the first COR is the only way to compromise sender properties
or source-destination linking. In the local-COR configuration, the only way that
source-destination linking or any sender properties can be compromised is if the adversary can somehow
infer  that  the  first  COR  is  in  fact  first.  There  is  no  way  for  this  to  happen  within  our  threat 
model  unless  all  other  CORs  are  compromised.  (If  all  CORs  connected  to  the  local-COR  were 
compromised  they  could  infer  that  this  was  the  first  COR,  but  since  we  have  assumed  a  clique 
of  CORs,  this  would  imply  that  the  local-COR  is  the  only  uncompromised  COR  in  the  onion 
routing  network.) 

There  is  a  way  that  the  first  COR  could  be  identified  as  such  that  is  outside  of  our  described 
threat  model:  if  the  second  COR  is  compromised,  and  if  it  is  possible  to  predict  that  some  data 
cells  will  produce  an  immediate  response  from  the  initiator,  then  the  second  COR  may  be  able 
to  infer  that  the  first  COR  is  first  by  the  response  time.  This  possibility  is  less  remote  if  the  last 
COR  is  also  compromised  and  we  assume  that  the  data  sent  over  it  is  not  end-to-end  encrypted 
for  the  responder.  We  will  return  to  this  discussion  below. 



/* assume at least one router on the route is initially compromised */
/* assume c >= 2 */
/* HEALTHY_ROUTERS is the set of hitherto uncorrupted routers */
/* remove_from_set() returns the set minus the element to be removed; */
/* if the element is not in the set, it returns the set unchanged */
/* compute_max_span() returns R_s and R_e, the compromised routers
 * farthest apart on the route */

R_1_found = R_n_found = false;
available_random = c-2;
HEALTHY_ROUTERS = ALL_ROUTERS;

/* loop until both route endpoints have been found */
while ( (! R_1_found) || (! R_n_found) )
{
    /* identify first, last subverted routers */
    compute_max_span(R_s, R_e);

    /* note that it’s possible that R_s=R_e */
    if ( (! R_1_found) && R_s==R_1 )
    {   R_1_found = true;
        available_random++;    /* one more router free for random picks */
    }
    if ( (! R_n_found) && R_e==R_n )
    {   R_n_found = true;
        available_random++;
    }

    /* move one hop closer to each endpoint not yet reached */
    if ( ! R_1_found )
    {   R_s = prev_hop(R_s);
        subvert(R_s);
        HEALTHY_ROUTERS = remove_from_set(HEALTHY_ROUTERS, R_s);
    }
    if ( ! R_n_found )
    {   R_e = next_hop(R_e);
        subvert(R_e);
        HEALTHY_ROUTERS = remove_from_set(HEALTHY_ROUTERS, R_e);
    }

    /* subvert a set of random routers */
    for (i=0; i<available_random; i++)
    {   j = random_router_index(HEALTHY_ROUTERS);
        subvert(R_j);
        HEALTHY_ROUTERS = remove_from_set(HEALTHY_ROUTERS, R_j);
    }
}

Figure  2:  Pseudo-code  for  the  adversary’s  game.

7  Related  Work


Basic  comparison  of  Onion  Routing  to  broadly  related  anonymity  mechanisms,  such  as  remailers 
[11,  5]  and  ISDN-Mixes  [14]  can  be  found  in  [16].  Also  mentioned  there  are  such  complementary 
connection-based mechanisms as LPWA [7] and the Anonymizer [1]. These are both very effective
at  anonymizing  the  data  stream  in  different  ways,  but  they  both  pass  all  traffic  directly  from 
the  initiator  via  a  single  filtering  point  to  the  responder.  There  is  thus  minimal  protection  for 
the  anonymity  of  the  connection  itself,  which  is  our  primary  focus.  We  therefore  restrict  our 
comments  to  related  work  that  is  directed  to  wide-spread  Internet  communication  either  below 
the  application  layer  or  specifically  for  some  form  of  connection  based  traffic.  For  this  reason, 
direct  comparisons  to  local  anonymity  systems  such  as  TG  and  SS  [12]  are  omitted  due  to  their 
small  deployable  size  and  tight  timing  constraints. 

We  know  of  only  one  other  published  proposal  for  application-independent,  traffic-analysis- 
resistant  Internet  communication,  viz:  that  of  [6],  wherein  a  system  is  described  that  effectively 
builds  an  onion  for  every  IP  packet.  There  is  thus  no  connection,  either  in  the  sense  of  a 
TCP/IP  socket,  or  more  significantly,  in  the  sense  of  a  path  of  nodes  that  can  perform  fast 
(symmetric)  encryption  on  passing  traffic.  In  Onion  Routing,  computationally  expensive  public- 
key  cryptography  is  used  only  during  connection  setup.  Using  an  onion  for  every  IP  packet 
makes  for  real-time  capabilities  significantly  slower  than  those  of  Onion  Routing  and  thus  less 
applicable  to  such  things  as  telnet  connections  or  even  Web  traffic — if  loading  pages  is  expected 
to  be  anywhere  close  to  current  speeds.  On  the  other  hand,  for  applications  that  do  not  have 
these  requirements,  such  a  design  may  offer  better  security  since  there  is  no  recurrent  path  for 
packets  in  an  (application)  connection. 

A  commercial  system  that  appears  to  be  quite  similar  to  Onion  Routing  is  being  built  by 
Zero  Knowledge  Systems  (www.freedom.net).  Like  Onion  Routing,  it  establishes  a  connection 
in  the  form  of  a  path  of  routers  that  have  been  keyed  for  subsequent  data  passing;  however, 
its  underlying  transmission  is  based  on  UDP  rather  than  TCP/IP.  While  the  system  provides 
other  services,  such  as  pseudonym  management,  it  appears  to  be  limited  to  the  configuration 
comparable  to  a  customer  building  onions  and  connecting  to  an  onion  router  at  an  ISP.  Unless 
the  ISP  is  trusted,  this  constitutes  the  remote-COR  configuration.  Also,  routes  appear  to  be 
limited  to  a  fixed  length  of  three  hops.  This  means  that  the  middle  hop  knows  the  entire  route. 
Besides  being  more  vulnerable  to  a  roving  attacker,  these  observations  show  that  the  system 
is  not  suited  to  enclave  level  protection  without  some  modification  and  extension.  This  is  not 
surprising  since  the  design  appears  focused  on  the  protection  of  individual  users. 

Given  the  above,  our  remaining  comparative  comments  will  be  with  respect  to  Crowds.  We 
begin  with  a  brief  comparative  description.  The  first  thing  to  note  is  that  Crowds  is  designed 
exclusively  for  Web  traffic.  Interestingly,  though  Crowds  is  only  to  be  used  for  (short-lived) 
Web connections, it makes use of longstanding cryptographic paths. Once a path is established
from  an  initiator  through  a  Crowd,  all  subsequent  HTTP  connections  are  passed  through  that 
path.  (The  tail  of  a  path  is  randomly  regenerated  only  beyond  a  break  point  and  only  when 
broken.  Whole  paths  are  regenerated  only  when  new  members  are  brought  into  the  Crowd.) 
This  means  that  a  path  initiator  is  less  likely  to  be  identified  than  would  be  the  case  if  a 
new  path  were  built  for  each  connection  [17].  This  is  especially  important  because,  unlike  in 
Onion Routing, a compromised node knows content and destination of all connections passing
through it (see below). This means that adversaries have a better means to build likely profiles
of repeated connections by the same initiator. The static paths of Crowds make such profile
information  less  useful  by  making  it  harder  to  know  which  Crowds  member  is  profiled.  If  a 
new  path  were  built  for  each  connection,  compromised  nodes  would  have  a  better  chance  of 
intersecting  predecessors  with  the  same  profile  and  thus  identifying  the  initiator.  On  the  other 
hand,  known  (rather  than  inferred)  path  profiles  in  that  case  would  be  much  less  complete, 
i.e.,  forward  anonymity  (in  the  sense  of  [21])  is  worse  for  static  paths.  Put  another  way,  in 
the  remote-COR  configuration,  assuming  a  fixed  distributed  adversary,  the  likelihood  that  some 
connection  one  makes  will  have  a  compromised  first  and  last  node  increases  over  the  number 
of connections made (if the first COR is chosen differently each time). However, the compromise
pertains  only  to  the  current  connection.  If  paths  are  static,  compromise  is  ongoing. 
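The growth of this likelihood with the number of connections can be illustrated with a short Python sketch. It assumes a fixed compromised set of c out of r CORs, routes longer than two hops, and first and last CORs chosen independently and uniformly for every connection, so that each connection is fully compromised with probability $c^2/r^2$; the function name and the sample values are hypothetical.

```python
def prob_some_connection_compromised(c, r, k):
    """Chance that at least one of k connections has both end CORs compromised.

    Assumes independent routes and a per-connection probability of (c/r)**2.
    """
    per_connection = (c / r) ** 2
    return 1 - (1 - per_connection) ** k


if __name__ == "__main__":
    # illustrative values: 5 of 50 CORs compromised
    for k in (1, 10, 100):
        print(k, prob_some_connection_compromised(5, 50, k))
```

With these sample values the per-connection risk is 0.01, yet over 100 connections the chance that at least one is fully compromised is roughly 0.63. Each such compromise, however, affects only the connection on which it occurs.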

Encrypted  traffic  looks  the  same  as  it  passes  along  a  path  through  a  Crowd.  And,  the 
decryption  key  is  available  to  all  path  participants.  Thus,  any  compromised  node  on  the  path 
compromises  both  receiver  activity  and  receiver  content.  Also,  link  padding  is  not  a  system 
option.  As  a  result,  a  local  eavesdropper  (observing  links  from  the  initiator)  and  any  one 
compromised  member  of  a  path  completely  compromise  all  properties  mentioned  in  section  3. 
A  local  eavesdropper  in  the  remote-COR  configuration  of  Onion  Routing  compromises  sender 
activity;  however,  unless  the  last  COR  in  the  connection  is  also  compromised,  nothing  else  is 
revealed.  On  the  other  hand,  in  this  configuration,  the  first  untrusted  component  of  the  system 
is  able  to  compromise  sender  activity.  And,  a  local  eavesdropper  together  with  a  compromised 
last  COR  compromise  all  properties  in  section  3.  For  Crowds  without  a  local  eavesdropper,  the 
first  untrusted  component  of  the  system  (i.e.,  the  next  node  on  the  path)  cannot  identify  the 
initiator  with  certainty,  in  fact  with  any  more  likelihood  than  that  dictated  by  the  probability 
of  forwarding  (i.e.,  the  probability  of  extending  the  path  vs.  connecting  to  the  responder). 

In  the  local-COR  configuration  Onion  Routing  provides  similar  protections  to  Crowds  of 
sender  activity,  source-destination  linking,  and  sender  content,  and  much  better  protection 
against  receiver  activity  or  receiver  content,  with  the  adversary  model  we  have  set  out  above  and 
without  a  local  eavesdropper.  If  we  add  to  the  adversary  model,  things  get  more  complicated. 

As  noted  in  section  6.2,  if  an  adversary  that  has  compromised  the  second  COR  in  a  route 
can  predict  which  data  will  prompt  an  immediate  response  from  the  initiator  (e.g.,  if  the  final 
COR  is  also  compromised),  then  she  may  be  able  to  time  responses  to  determine  that  there  can 
be  at  most  one  COR  prior  in  the  route.  This  sort  of  problem  was  recognized  early  on  in  the 
design  of  Crowds.  It  is  a  more  serious  problem  for  Crowds  because  the  second  node  alone  can 
read requests and responses, making it easy to find the timing-relevant data. In Crowds, if URLs
are  included  in  data  coming  from  a  responder  that  will  prompt  a  subsequent  request  from  the 
initiator,  nodes  later  on  the  path  will  themselves  parse  the  HTML  and  make  these  requests  back 
in  the  direction  of  the  responder,  thus  eliminating  the  timing  distinction  between  the  first  node 
and  any  others. 

Timing  information  is  thus  obscured  by  having  the  nodes  on  the  path  actively  processing  and 
filtering  data  as  well  as  managing  the  connection.  This  is  important  because  it  means  that  all 
nodes  must  be  able  to  read  all  traffic  so  that  the  data  stream  must  be  anonymous.  Therefore, 
unlike  Onion  Routing,  Crowds  inherently  cannot  be  used  in  circumstances  where  one  would 
like  to  identify  (and  possibly  authenticate)  oneself  to  the  far  end  but  would  like  to  hide  from 
others with whom one is communicating. Perhaps more importantly, as ever more functionality
is added to Web browsers, and since nodes on a Crowds route must be able to read and alter
all  traffic  on  a  connection,  a  single  compromised  node  can  embed  requests,  either  for  identifying 
information  or  to  open  connections  directly  back  to  it — bypassing  the  crowd  and  identifying 
the  originator.  Obvious  means  of  attack  such  as  by  use  of  cookies,  or  by  Java,  Javascript, 
etc.  are  easily  shut  off  or  filtered  (along  with  their  provided  functionality).  However,  other 
more  subtle  mechanisms  are  also  available  (cf.  www.onion-router.net/Tests.html  for  a  list  of  test 
sites,  in  particular  www.onion-router.net/dynamic/snoop  ),  and  more  are  becoming  available 
all  the  time.  The  upshot  of  all  this  is  that,  for  Crowds,  the  anonymity  of  the  connection  is 
always  at  best  as  good  as  the  latest  installed  filtering  code  for  anonymity  of  the  data  stream. 
For  Onion  Routing,  these  two  functions — anonymity  of  the  connection  and  anonymity  of  the 
data  stream — are  separate,  and  a  new  means  to  identify  the  initiator  via  the  data  stream  only 
affects  the  anonymity  of  the  data  stream.  Even  then,  it  is  only  employable  at  the  far  end  of  a 
connection,  rather  than  by  any  node  on  an  anonymized  connection. 

The  last  point  of  comparison  we  discuss  is  performance.  Direct  comparison  in  practice  is 
nearly  impossible.  Each  onion  routing  connection  requires  several  public-key  decryptions.  So, 
with  respect  to  cryptographic  overhead  Crowds  has  much  better  performance  potential.  On 
the  other  hand,  within  an  onion  routing  network,  connections  are  expected  to  be  longstanding 
high-capacity  channels  between  dedicated  machines,  possibly  with  cryptographic  coprocessors. 
The  major  performance  limitation  is  often  likely  to  be  the  end  user’s  Internet  connection.  But, 
for  Crowds  the  end  users  are  the  network.  Thus,  Crowds  members  with  slow  or  intermittent 
connections  will  affect  the  performance  of  everyone  in  their  crowd.  One  can  limit  Crowds 
participation  to  those  with  longstanding,  high-speed  (say  T1  or  better)  Internet  connections. 
But,  this  will  seriously  limit  the  population  for  whom  it  is  feasible.  Depending  on  the  user 
population,  network  configuration,  and  the  components  that  make  up  the  network,  one  is  likely 
to  find  very  different  performance  numbers  in  each  of  the  systems. 

8  Conclusions  and  Future  Work 

We  have  presented  some  of  the  features  of  the  current  Onion  Routing  design  and  analyzed  its 
resistance to worst-case adversaries. This design generally resists traffic analysis more effectively
than any other published and deployed mechanism for Internet communication.

We  note  some  ways  that  the  design  might  be  changed  to  improve  security:  Adding  a  time 
delay  to  traffic  at  the  proxy  could  complicate  timing  attacks  against  the  local-COR  configuration 
to  determine  the  first  COR.  (Similarly,  if  the  last  COR  is  local  to  the  responder,  in  the  sense 
of  this  paper,  then  it  would  be  possible  to  add  a  time  delay  at  the  responder  proxy.)  Of  course, 
this  is  only  necessary  when  the  goal  is  actually  to  protect  the  local  COR,  for  example  to  protect 
the  activity  of  an  enclave  or  if  the  COR  is  run  by  one  or  a  few  individuals  who  are  the  only  ones 
accessing/exiting  the  onion  routing  network  through  that  COR.  Suppose  a  typical  customer-ISP 
configuration,  in  which  the  initiator  is  someone  connecting  through  dial-up  to  an  ISP  running  an 
onion  router.  As  noted  above  in  Section  4,  this  could  be  viewed  as  a  local-COR  configuration. 
But,  in  this  case,  it  is  the  anonymity  of  the  individual  source  rather  than  the  COR  that  matters. 
Thus,  no  delay  is  necessary.  (One  could  address  a  semi-trusted  local-COR  by  building  onions 
at the workstation for a COR, e.g., at an ISP or an enclave firewall. Such options are discussed
in [20].)

Finally,  if  partial-route  padding  is  used  on  individual  connections,  besides  link  padding, 
then  compromise  by  even  internal  attackers  is  complicated.  For  example,  a  local  eavesdropper  or 
compromised  first  COR  (in  the  remote-COR  configuration)  would  not  be  able  to  easily  cooperate 
with  a  compromised  last  COR  to  break  source-destination  linking.  In  fact,  the  second  generation 
design  has  been  made  consistent  with  the  possibility  that  onion  proxies  can  choose  to  do  this 
via  in-channel  signaling  to  intermediate  CORs  if  they  so  desire.  Also,  long-lived  application 
connections  could  be  hopped  between  shorter-lived  Onion  Routing  connections  using  specialized 
proxies.  This  would  both  frustrate  a  roving  attacker,  and  make  such  connections  look  more  like 
short-lived  connections  even  to  network  insiders.  We  have  discussed  some  of  the  features  such 
proxies might have, but such proxies have not yet been designed.